Smart Paper Notes

On Quantum Speedups for Nonconvex Optimization via Quantum Tunneling Walks

Yizhou Liu , Weijie J. Su , Tongyang Li

Categories: Machine Learning

2022-09-29

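Classical algorithms are often ineffective on nonconvex optimization problems whose local minima are separated by high barriers. In this paper, we explore possible quantum speedups for nonconvex optimization by leveraging the global effect of quantum tunneling. Specifically, we introduce a quantum algorithm termed the quantum tunneling walk (QTW) and apply it to nonconvex problems where local minima are approximately global minima. We show that QTW achieves a quantum speedup over classical stochastic gradient descent (SGD) when the barriers between different local minima are high but thin and the minima are flat. Based on this observation, we construct a specific double-well landscape in which classical algorithms cannot efficiently hit one target well knowing the other well, but QTW can when given proper initial states near the known well. Finally, we corroborate our findings with numerical experiments.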


When does SGD favor flat minima? A quantitative characterization via linear stability

Lei Wu , Mingze Wang , Weijie Su

Categories: Machine Learning (Statistics) | Machine Learning

2022-07-06

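The observation that stochastic gradient descent (SGD) favors flat minima has played a fundamental role in understanding the implicit regularization of SGD and in guiding hyperparameter tuning. In this paper, we provide a quantitative explanation of this striking phenomenon by relating the particular noise structure of SGD to its \emph{linear stability} (Wu et al., 2018). Specifically, we consider training over-parameterized models with square loss. We prove that if a global minimum $\theta^*$ is linearly stable for SGD, then it must satisfy $\|H(\theta^*)\|_F \leq O(\sqrt{B}/\eta)$, where $\|H(\theta^*)\|_F$, $B$, and $\eta$ denote the Frobenius norm of the Hessian at $\theta^*$, the batch size, and the learning rate, respectively. Otherwise, SGD will escape from that minimum \emph{exponentially} fast. Hence, for minima accessible to SGD, the flatness, as measured by the Frobenius norm of the Hessian, is bounded independently of the model size and sample size. The key to obtaining these results is exploiting the particular geometry awareness of SGD noise: 1) the noise magnitude is proportional to the loss value; 2) the noise directions concentrate in the sharp directions of the local landscape. This property of SGD noise is provably shown for linear networks and random feature models (RFMs) and empirically verified for nonlinear networks. Moreover, the validity and practical relevance of our theoretical findings are justified by extensive numerical experiments.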
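As a rough illustration of the necessary condition $\|H(\theta^*)\|_F \leq O(\sqrt{B}/\eta)$, the sketch below computes the Hessian of the square loss for a linear model and compares its Frobenius norm with $c\sqrt{B}/\eta$. The linear model, the function names, and the constant `c` (a stand-in for the constant hidden in the $O(\cdot)$) are assumptions made for illustration and are not taken from the paper.

```python
import math

def hessian_square_loss(xs):
    """Hessian of L(w) = 1/(2n) * sum_i (w . x_i - y_i)^2 for a linear model.

    For this loss the Hessian is (1/n) * sum_i x_i x_i^T and does not depend on w.
    """
    n, d = len(xs), len(xs[0])
    return [[sum(x[i] * x[j] for x in xs) / n for j in range(d)] for i in range(d)]

def frobenius_norm(matrix):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(v * v for row in matrix for v in row))

def within_stability_bound(xs, batch_size, lr, c=1.0):
    """Check ||H||_F <= c * sqrt(B) / eta.

    c is a hypothetical placeholder for the unspecified constant in O(.).
    Failing this check means the minimum cannot be linearly stable
    (under the assumed constant); passing it does not prove stability.
    """
    return frobenius_norm(hessian_square_loss(xs)) <= c * math.sqrt(batch_size) / lr

# Toy data: two samples in two dimensions.
xs = [[1.0, 0.0], [0.0, 2.0]]
print(frobenius_norm(hessian_square_loss(xs)))
print(within_stability_bound(xs, batch_size=4, lr=0.1))
print(within_stability_bound(xs, batch_size=1, lr=2.0))
```

The check is one-directional: the abstract states the bound as a consequence of linear stability, so it can only rule minima out.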


A Truthful Owner-Assisted Scoring Mechanism

Weijie J. Su

Categories: Machine Learning

2022-06-14

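Alice (the owner) knows the underlying quality of her items, measured in grades. Given noisy grades provided by an independent party, can Bob (the appraiser) obtain accurate estimates of the ground-truth grades of the items by asking Alice a question about the grades? We address this question when the payoff to Alice is an additive convex utility over all her items. We establish that if Alice has to answer the question truthfully in order to maximize her payoff, the question must be formulated as pairwise comparisons between her items. Next, we prove that if Alice is required to provide a ranking of her items, which is the most fine-grained question that can be asked via pairwise comparisons, she will be truthful. By incorporating the ground-truth ranking, we show that Bob can obtain an estimator with the optimal squared error in certain regimes among all possible ways of truthful information elicitation. Moreover, the estimated grades are substantially more accurate than the raw grades when the number of items is large and the raw grades are very noisy. Finally, we conclude the paper with several extensions and some refinements for practical considerations.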


Analytical Composition of Differential Privacy via the Edgeworth Accountant

Hua Wang , Sheng Gao , Huanyu Zhang , Milan Shen , Weijie J. Su

Categories: Machine Learning | Machine Learning (Statistics)

2022-06-09

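Many modern machine learning algorithms are composed of simple private algorithms; thus, an increasingly important problem is to efficiently compute the overall privacy loss under composition. In this study, we introduce the Edgeworth Accountant, an analytical approach to composing the differential privacy guarantees of private algorithms. The Edgeworth Accountant starts by losslessly tracking the privacy loss under composition using the $f$-differential privacy framework, which allows us to express the privacy guarantees using privacy-loss log-likelihood ratios (PLLRs). As the name suggests, this accountant next uses the Edgeworth expansion to upper and lower bound the probability distribution of the sum of the PLLRs. Moreover, by relying on a technique for approximating complex distributions using simple ones, we demonstrate that the Edgeworth Accountant can be applied to the composition of any noise-addition mechanism. Owing to certain appealing features of the Edgeworth expansion, the $(\epsilon, \delta)$-differential privacy bounds offered by this accountant are non-asymptotic and come with essentially no extra computational cost, in contrast to prior approaches, whose running times increase with the number of compositions. Finally, we demonstrate that our upper and lower $(\epsilon, \delta)$-differential privacy bounds are tight in federated analytics and in certain regimes of training private deep learning models.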
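To make the idea of composing guarantees in the $f$-differential privacy framework concrete, the sketch below handles only the simplest special case: every mechanism is Gaussian, so each step is $\mu$-GDP (Gaussian differential privacy), the composition is again GDP with $\mu = \sqrt{\sum_i \mu_i^2}$, and the conversion to $(\epsilon, \delta)$ has a closed form. This is not the Edgeworth Accountant and involves no Edgeworth expansion; the function names are hypothetical.

```python
import math

def std_normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def compose_gdp(mus):
    """Composition of mu_i-GDP mechanisms is sqrt(sum mu_i^2)-GDP."""
    return math.sqrt(sum(m * m for m in mus))

def gdp_to_delta(mu, eps):
    """Smallest delta such that a mu-GDP mechanism is (eps, delta)-DP."""
    return (std_normal_cdf(-eps / mu + mu / 2.0)
            - math.exp(eps) * std_normal_cdf(-eps / mu - mu / 2.0))

# 100 compositions of a 0.1-GDP step (illustrative numbers only).
mu_total = compose_gdp([0.1] * 100)
delta = gdp_to_delta(mu_total, eps=1.0)
print(mu_total, delta)
```

For non-Gaussian mechanisms the sum of PLLRs is no longer exactly Gaussian, which is the gap the abstract says the Edgeworth expansion is used to bound.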


ROCK: Causal Inference Principles for Reasoning about Commonsense Causality

Jiayao Zhang , Hongming Zhang , Weijie J. Su , Dan Roth

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-01-31

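Commonsense causality reasoning (CCR) aims at identifying plausible causes and effects in natural language descriptions that are deemed reasonable by an average person. Although of great academic and practical interest, this problem is still shadowed by the lack of a well-posed theoretical framework; existing work usually relies wholeheartedly on deep language models and is potentially susceptible to confounding co-occurrences. Motivated by classical causal principles, we articulate the central question of CCR and draw parallels between human subjects in observational studies and natural languages in order to adapt CCR to the potential-outcomes framework, which is the first such attempt for commonsense tasks. We propose a novel framework, ROCK, to Reason O(A)bout Commonsense K(C)ausality, which utilizes temporal signals as incidental supervision and balances confounding effects using temporal propensities that are analogous to propensity scores. The ROCK implementation is modular and zero-shot, and demonstrates good CCR capabilities.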


Neurashed: A Phenomenological Model for Imitating Deep Learning Training

Weijie J. Su

Categories: Machine Learning | Computer Vision | Machine Learning (Statistics)

2021-12-17

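To advance deep learning methodologies in the next decade, a theoretical framework for reasoning about modern neural networks is needed. While efforts to demystify why deep learning is so effective are increasing, a comprehensive picture remains lacking, suggesting that a better theory is possible. We argue that a future deep learning theory should inherit three characteristics: a \textit{hierarchically} structured network architecture, parameters \textit{iteratively} optimized using stochastic gradient-based methods, and information from the data that evolves \textit{compressively}. As an instantiation, we integrate these characteristics into a graphical model called \textit{neurashed}. This model effectively explains some common empirical patterns in deep learning. In particular, neurashed enables insights into implicit regularization, the information bottleneck, and local elasticity. Finally, we discuss how neurashed can guide the development of deep learning theories.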


You Are the Best Reviewer of Your Own Papers: An Owner-Assisted Scoring Mechanism

Weijie J. Su

Categories: Machine Learning | Machine Learning (Statistics)

2021-10-27

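I consider a setting where reviewers offer very noisy scores for several items in order to select high-quality ones (e.g., peer review of large conference proceedings), whereas the owner of these items knows the true underlying scores but prefers not to provide this information. To address this withholding of information, I introduce the Isotonic Mechanism, a simple and efficient approach to improving imprecise raw scores by leveraging certain information that the owner is incentivized to provide. In addition to the raw scores provided by the reviewers, this mechanism takes as input the owner's ranking of the items from best to worst. It reports adjusted scores for the items by solving a convex optimization problem. Under certain conditions, I show that the owner's optimal strategy is to honestly report the true ranking of the items to the best of her knowledge in order to maximize her expected utility. Moreover, I prove that the adjusted scores provided by this owner-assisted mechanism are significantly more accurate than the raw scores provided by the reviewers. The paper concludes with several extensions of the Isotonic Mechanism and some refinements of the mechanism for practical considerations.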
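The abstract does not spell out the convex problem. A minimal sketch, assuming it is the least-squares projection of the raw scores onto the set of score vectors consistent with the owner's ranking (isotonic regression, as the mechanism's name suggests), can be written with the pool-adjacent-violators algorithm. The function name and the toy scores are hypothetical.

```python
def isotonic_adjust(raw_scores, ranking):
    """Project raw scores onto vectors that respect the owner's ranking.

    raw_scores: list of noisy reviewer scores, indexed by item.
    ranking: item indices ordered from best to worst, as claimed by the owner.
    Returns adjusted scores (same indexing) minimizing the squared distance to
    raw_scores subject to being non-increasing along the ranking.
    """
    # Walk from worst to best so the fitted sequence must be non-decreasing.
    order = list(reversed(ranking))
    blocks = []  # each block holds [sum of scores, number of items]
    for idx in order:
        blocks.append([raw_scores[idx], 1])
        # Pool adjacent blocks while the earlier mean exceeds the later mean.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    adjusted = [0.0] * len(raw_scores)
    for idx, value in zip(order, fitted):
        adjusted[idx] = value
    return adjusted

# Owner says item 0 is best, then item 1, then item 2.
print(isotonic_adjust([6.0, 8.0, 5.0], ranking=[0, 1, 2]))
```

In the toy call, the raw scores of items 0 and 1 contradict the claimed ranking, so they are pooled to their average; scores already consistent with the ranking are left unchanged.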


Deformable DETR: Deformable Transformers for End-to-End Object Detection

Xizhou Zhu , Weijie Su , Lewei Lu , Bin Li , Xiaogang Wang , Jifeng Dai

Categories:

2020-10-08

DETR has been recently proposed to eliminate the need for many hand-designed components in object detection while demonstrating good performance. However, it suffers from slow convergence and limited feature spatial resolution, due to the limitation of Transformer attention modules in processing image feature maps. To mitigate these issues, we propose Deformable DETR, whose attention modules only attend to a small set of key sampling points around a reference. Deformable DETR can achieve better performance than DETR (especially on small objects) with 10× fewer training epochs. Extensive experiments on the COCO benchmark demonstrate the effectiveness of our approach. Code is released at https://github.com/fundamentalvision/Deformable-DETR.
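The phrase "attend to a small set of key sampling points around a reference" can be illustrated with a heavily simplified sketch: one query, one head, one scale, and a scalar feature map. Here the sampling offsets and attention logits are passed in by hand; in the actual model they are predicted from the query, and the features are multi-channel and multi-scale. All names below are hypothetical and this is not the released implementation.

```python
import math

def bilinear_sample(feat, x, y):
    """Bilinearly interpolate feat[row][col] at continuous (x, y); outside is zero."""
    h, w = len(feat), len(feat[0])
    x0, y0 = math.floor(x), math.floor(y)
    total = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                weight = (1.0 - abs(x - xx)) * (1.0 - abs(y - yy))
                total += weight * feat[yy][xx]
    return total

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def deformable_attention(feat, ref, offsets, logits):
    """Weighted sum of features sampled at ref + offset for each sampling point.

    ref: (x, y) reference point; offsets: list of (dx, dy); logits: one per offset.
    The cost depends on the number of sampling points, not on the map size.
    """
    weights = softmax(logits)
    return sum(w * bilinear_sample(feat, ref[0] + dx, ref[1] + dy)
               for w, (dx, dy) in zip(weights, offsets))

feat = [[0.0, 1.0],
        [2.0, 3.0]]
print(deformable_attention(feat, ref=(0.0, 0.0),
                           offsets=[(0.0, 0.0), (1.0, 1.0)],
                           logits=[0.0, 0.0]))
```

Because each query touches only a fixed number of sampled locations, the work per query does not grow with the feature-map resolution, which is the contrast with dense attention that the abstract draws.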


Informing selection of performance metrics for medical image segmentation evaluation using configurable synthetic errors

Shuyue Guan , Ravi K. Samala , Weijie Chen

Categories: Computer Vision

2022-12-30

Machine learning-based segmentation in medical imaging is widely used in clinical applications from diagnostics to radiotherapy treatment planning. Segmented medical images with ground truth are useful for investigating the properties of different segmentation performance metrics to inform metric selection. Regular geometrical shapes are often used to synthesize segmentation errors and illustrate properties of performance metrics, but they lack the complexity of anatomical variations in real images. In this study, we present a tool to emulate segmentations by adjusting the reference (truth) masks of anatomical objects extracted from real medical images. Our tool is designed to modify the defined truth contours and emulate different types of segmentation errors with a set of user-configurable parameters. We defined the ground truth objects from 230 patient images in the Glioma Image Segmentation for Radiotherapy (GLIS-RT) database. For each object, we used our segmentation synthesis tool to synthesize 10 versions of segmentation (i.e., 10 simulated segmentors or algorithms), where each version has a pre-defined combination of segmentation errors. We then applied 20 performance metrics to evaluate all synthetic segmentations. We demonstrated the properties of these metrics, including their ability to capture specific types of segmentation errors. By analyzing the intrinsic properties of these metrics and categorizing the segmentation errors, we are working toward the goal of developing a decision-tree tool for assisting in the selection of segmentation performance metrics.
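As a toy illustration of the workflow described above (perturb a truth mask with a configurable error, then score the result with a metric), the sketch below uses a one-pixel dilation as a stand-in for an over-segmentation error and the Dice coefficient as the metric. The abstract does not list the tool's error types or its 20 metrics, so both choices and all function names are assumptions.

```python
def dice(truth, pred):
    """Dice coefficient between two binary masks given as lists of 0/1 rows."""
    intersection = sum(t and p
                       for t_row, p_row in zip(truth, pred)
                       for t, p in zip(t_row, p_row))
    size = sum(map(sum, truth)) + sum(map(sum, pred))
    return 1.0 if size == 0 else 2.0 * intersection / size

def dilate(mask):
    """Grow the mask by one pixel in the four axis directions (hypothetical error type)."""
    h, w = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    for r in range(h):
        for c in range(w):
            if mask[r][c]:
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:
                        out[rr][cc] = 1
    return out

truth = [[0, 0, 0],
         [0, 1, 0],
         [0, 0, 0]]
over_segmented = dilate(truth)
print(dice(truth, truth), dice(truth, over_segmented))
```

On such a small object a one-pixel boundary error already drives Dice well below 1, which is the kind of metric property the study sets out to characterize.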


Examining Political Rhetoric with Epistemic Stance Detection

Ankita Gupta , Su Lin Blodgett , Justin H Gross , Brendan O'Connor

Categories: Natural Language Processing

2022-12-29

Participants in political discourse employ rhetorical strategies -- such as hedging, attributions, or denials -- to display varying degrees of belief commitments to claims proposed by themselves or others. Traditionally, political scientists have studied these epistemic phenomena through labor-intensive manual content analysis. We propose to help automate such work through epistemic stance prediction, drawn from research in computational semantics, to distinguish at the clausal level what is asserted, denied, or only ambivalently suggested by the author or other mentioned entities (belief holders). We first develop a simple RoBERTa-based model for multi-source stance predictions that outperforms more complex state-of-the-art modeling. Then we demonstrate its novel application to political science by conducting a large-scale analysis of the Mass Market Manifestos corpus of U.S. political opinion books, where we characterize trends in cited belief holders -- respected allies and opposed bogeymen -- across U.S. political ideologies.
